Population class extension: dataIO module

File containing the class extension for the population object that contains data input-output (IO) functions

class binarycpython.utils.population_extensions.dataIO.dataIO(**kwargs)[source]

Bases: object

Class extension for the population object that contains data input-output (IO) functions

NFS_flush_hack(filename)[source]

Use opendir()/closedir() to flush NFS access to a file.

NOTE: this may or may not work!

TODO: This function leads to a complaint about unclosed scandir operators. Check if that can be resolved.
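As a minimal illustration of the opendir()/closedir() trick, the sketch below forces a directory listing with os.scandir() and closes the iterator explicitly, which avoids the "unclosed scandir" warning mentioned in the TODO. This is an assumed stand-in, not the library's actual implementation, and on a local filesystem it is a harmless no-op.

```python
import os
import tempfile

def nfs_flush_hack(filename):
    """Sketch: nudge an NFS client to revalidate its cached view of the
    directory containing *filename* by opening and closing a directory
    scan (the opendir()/closedir() trick)."""
    dirname = os.path.dirname(filename) or "."
    # Using the context manager closes the scandir iterator deterministically,
    # avoiding the "unclosed scandir" ResourceWarning.
    with os.scandir(dirname) as entries:
        for _ in entries:
            pass

# Usage: harmless on a local filesystem
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.txt")
    open(path, "w").close()
    nfs_flush_hack(path)
```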

NFSpath(path)[source]

Test path to see if it’s on an NFS mount.

Parameters

path – the path to be tested

Returns

True if the path is on an NFS mount point, False if it is not, None if the path does not exist.

Return type

bool or None

compression_type(filename)[source]

Return the compression type of the ensemble file, based on its filename extension.
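A sketch of extension-based compression detection is shown below. The extension-to-type mapping here is an assumption for illustration; the mapping actually used by binarycpython may differ.

```python
import os

# Assumed mapping from filename extension to compression type.
_COMPRESSION_BY_EXTENSION = {
    ".gz": "gzip",
    ".bz2": "bz2",
    ".xz": "lzma",
}

def compression_type(filename):
    """Sketch: return the compression type implied by *filename*'s final
    extension, or None if the extension is not recognised."""
    _, ext = os.path.splitext(filename)
    return _COMPRESSION_BY_EXTENSION.get(ext)

# e.g. compression_type("ensemble_output.json.gz") -> "gzip"
```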

dir_ok(directory)[source]

Function to test whether we can read from and write to a directory, which must exist. Returns True if all is OK, False otherwise.

load_population_object(filename)[source]

Returns the Population object loaded from the file at filename.

load_snapshot(file)[source]

Load a snapshot from file and set it in the preloaded_population placeholder.

locked_close(file, lock)[source]

Partner function to locked_open_for_write()

Closes and unlocks the file

locked_open_for_write(filename, encoding='utf-8', lock_suffix='.lock', lock_timeout=5, lock_lifetime=60, exists_ok=False, fatal_open_errors=True, vb=False, **kwargs)[source]

Wrapper for Python’s open(filename) which opens a file at filename for writing (mode “w”) and locks it.

If the file’s lockfile already exists, or if we cannot obtain a lock on the file, we return (None,None).

If the file does not exist, we keep trying to lock until it does.

To do the locking, we use flufl.lock which is NFS safe.

Parameters
  • lock_lifetime – (passed to flufl.lock.Lock()) default 60 seconds. It should take less than this time to write the file.

  • lock_timeout – (passed to flufl.lock.Lock()) default 5 seconds. This should be non-zero.

  • fatal_open_errors – if open() fails and fatal_open_errors is True, exit.

  • exists_ok – if False and the file at filename exists, return (None,None) (default False)

  • vb – verbose logging if True, defaults to False

Returns

(file_object, lock_object) tuple. If the file was not opened, returns (None,None).
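The open/lock pairing can be illustrated with a minimal stdlib-only lockfile sketch. Note this is an assumed stand-in: the real implementation uses flufl.lock, which is NFS-safe, whereas an O_EXCL lockfile is only safe locally.

```python
import os
import tempfile

def locked_open_for_write(filename, lock_suffix=".lock", exists_ok=False):
    """Sketch of the open-and-lock pattern. Returns a (file_object,
    lock_path) tuple, or (None, None) if the target file already exists
    (and exists_ok is False) or the lock cannot be acquired."""
    if not exists_ok and os.path.exists(filename):
        return None, None
    lock_path = filename + lock_suffix
    try:
        # O_EXCL makes creation atomic: it fails if the lockfile exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
    except FileExistsError:
        return None, None
    return open(filename, "w", encoding="utf-8"), lock_path

def locked_close(file, lock_path):
    """Partner to locked_open_for_write(): close the file, remove the lock."""
    file.close()
    os.remove(lock_path)
```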

merge_populations(refpop, newpop)[source]

Merge newpop’s results data into refpop’s results data.

Parameters
  • refpop – the original “reference” Population object to be added to

  • newpop – Population object containing the new data

Returns

nothing


merge_populations_from_file(refpop, filename)[source]

Wrapper for merge_populations so it can be done directly from a file.

Parameters
  • refpop – the original “reference” Population object to be added to

  • filename – file containing the Population object containing the new data

Note

The file should be saved using save_population_object()
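Assuming the results data are nested dictionaries whose numeric leaves accumulate across runs, a merge might look like the sketch below. This is an illustrative assumption, not binarycpython's actual merge logic.

```python
def merge_results(ref, new):
    """Sketch: recursively merge *new* into *ref*, summing numeric
    leaves. Modifies *ref* in place and returns nothing, mirroring
    merge_populations()."""
    for key, value in new.items():
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, creating them as needed.
            merge_results(ref.setdefault(key, {}), value)
        elif isinstance(value, (int, float)):
            # Numeric leaves (counts, probabilities) are accumulated.
            ref[key] = ref.get(key, 0) + value
        else:
            # Non-numeric leaves are simply overwritten.
            ref[key] = value
```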

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None, compression=None, compresslevel=None, vb=False)[source]

Wrapper for open() with automatic compression based on the file extension.
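A minimal sketch of such a wrapper, dispatching on the filename extension to the stdlib compression modules and falling back to the builtin open(). The dispatch table is an assumption; the real wrapper also takes compression and compresslevel arguments.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Assumed extension-to-opener dispatch table.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}

def open_auto(file, mode="rt", encoding=None, **kwargs):
    """Sketch: open *file* with a compression module chosen from its
    extension, falling back to the builtin open()."""
    for ext, opener in _OPENERS.items():
        if file.endswith(ext):
            return opener(file, mode, encoding=encoding, **kwargs)
    return open(file, mode, encoding=encoding, **kwargs)
```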

save_population_object(population_object=None, filename=None, confirmation=True, compression='gzip')[source]

Save the pickled Population object to the file at filename or, if filename is None, to the path set in self.population_options[‘save_population_object’].

Parameters
  • population_object – the object to be saved to the file. If population_object is None, use self.

  • filename – the name of the file to be saved. If not set, use self.population_options[‘save_population_object’]

  • confirmation – if True, a file “filename.saved” is touched just after the dump, so we know it is finished. TODO: fix this

  • compression (optional, default = "gzip") – TODO: fix this

Compression is performed according to the filename, as stated in the compress_pickle documentation at https://lucianopaz.github.io/compress_pickle/html/

Shared memory, stored in the population_object.shared_memory dict, is not saved.

TODO: this function isn’t called correctly. Grep for and check the calls.

save_snapshot(file=None)[source]

Save the population object to a snapshot file, automatically choosing the filename if none is given.

set_status(string, format_statment='process_{}.txt', ID=None)[source]

Function to set the status string in its appropriate file

snapshot_filename()[source]

Automatically choose the snapshot filename.

wait_for_unlock(filename, lock_suffix='.lock')[source]

Companion to locked_open_for_write that waits for a filename to a) exist and b) be unlocked.

This should work because the lock file is created before the file is created.

write_binary_c_calls_to_file(output_dir=None, output_filename=None, include_defaults=False, encoding='utf-8')[source]

Function that loops over the grid code and writes the generated parameters to a file, in the form of command-line calls to binary_c.

Only useful when you have a variable grid as system_generator; a Monte Carlo grid would not be that useful here.

Also, make sure that this export includes the basic parameters such as m1, m2, separation, orbital period, eccentricity and probability. TODO: this function can probably be cleaned up a bit and could rely on the other startup and cleanup functions (see population_class). By default this writes to the data directory, if it exists.

Parameters
  • output_dir (Optional[str]) – (optional, default = None) directory to write the file to. If custom_options[‘data_dir’] is set, it takes precedence over output_dir.

  • output_filename (Optional[str]) – (optional, default = None) filename of the output. If not set it will be called “binary_c_calls.txt”

  • include_defaults (bool) – (optional, default = False) whether to include binary_c’s default parameter values in the lines that are written. Beware that this results in very long lines; it may be better to export the binary_c defaults and keep them in a separate file.

Returns

filename that was used to write the calls to

Return type

str
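Since each written line is a command-line call, a single line might be assembled from a parameter dictionary as in the sketch below. The parameter names and values here are purely illustrative, not binary_c's actual argument names.

```python
def binary_c_call_line(params, executable="binary_c"):
    """Sketch: format one system's parameters as a command-line call,
    as space-separated "name value" pairs after the executable."""
    args = " ".join(f"{key} {value}" for key, value in params.items())
    return f"{executable} {args}"

# One system's call (parameter names and values illustrative):
line = binary_c_call_line({
    "M_1": 1.0,           # primary mass
    "M_2": 0.5,           # secondary mass
    "separation": 100.0,  # orbital separation
    "eccentricity": 0.0,
    "probability": 1e-3,
})
```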

write_ensemble(output_file, data=None, sort_keys=True, indent=4, encoding='utf-8', ensure_ascii=False)[source]

Write ensemble results to a file.

Parameters
  • output_file

    the output filename.

    If the filename has an extension that we recognise, e.g. .gz or .bz2, we compress the output appropriately.

    The filename should contain .json or .msgpack, the two currently-supported formats.

    Usually you’ll want to output to JSON, but we can also output to msgpack.

  • data – the data dictionary to be converted and written to the file. If not set, this defaults to self.grid_ensemble_results.

  • sort_keys – if True, and output is to JSON, the keys will be sorted. (default: True, passed to json.dumps)

  • indent – number of space characters used in the JSON indent. (Default: 4, passed to json.dumps)

  • encoding – file encoding method, usually defaults to ‘utf-8’

  • ensure_ascii – the ensure_ascii flag passed to json.dump and/or json.dumps (Default: False)
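The JSON output path with extension-based compression can be sketched as follows. This is a simplified stand-in: the real method also supports .bz2 compression and msgpack output, and data defaults to self.grid_ensemble_results.

```python
import gzip
import json
import os
import tempfile

def write_ensemble_sketch(output_file, data, sort_keys=True, indent=4,
                          encoding="utf-8", ensure_ascii=False):
    """Sketch: serialise *data* to JSON, gzip-compressing the output
    when the filename ends in .gz."""
    text = json.dumps(data, sort_keys=sort_keys, indent=indent,
                      ensure_ascii=ensure_ascii)
    # Choose the opener from the extension, as in the open() wrapper.
    opener = gzip.open if output_file.endswith(".gz") else open
    with opener(output_file, "wt", encoding=encoding) as f:
        f.write(text)
```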